{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Simple Vector Stores - Maximum Marginal Relevance Retrieval"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook explores the use of MMR retrieval [<a href=\"https://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf\">1</a>]. Maximum marginal relevance iteratively selects documents that are relevant to the query but dissimilar to the results already selected. It has been shown to improve performance for LLM retrievals [<a href=\"https://arxiv.org/pdf/2211.13892.pdf\">2</a>].\n",
    "\n",
    "The maximum marginal relevance algorithm is as follows:\n",
    "$$\n",
    "\\text{MMR} = \\arg\\max_{d_i \\in D \\setminus R} [ \\lambda \\cdot Sim_1(d_i, q) - (1 - \\lambda) \\cdot \\max_{d_j \\in R} Sim_2(d_i, d_j) ]\n",
    "$$\n",
    "\n",
    "Here, $D$ is the set of all candidate documents, $R$ is the set of already selected documents, $q$ is the query, $Sim_1$ is the similarity function between a document and the query, and $Sim_2$ is the similarity function between two documents. $d_i$ ranges over the candidates not yet selected ($D \\setminus R$), and $d_j$ ranges over the documents in $R$.\n",
    "\n",
    "The parameter λ (`mmr_threshold`) controls the trade-off between relevance (the first term) and diversity (the second term). If `mmr_threshold` is close to 1, more emphasis is put on relevance, while an `mmr_threshold` close to 0 puts more emphasis on diversity."
   ]
  },
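  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the formula concrete, the next cell is a minimal pure-Python sketch of the greedy selection loop. It is not the library's implementation: the helper names `cosine` and `mmr_select`, the parameter `lam` (standing in for λ), and the toy 2-D vectors are all hypothetical, cosine similarity is assumed for both $Sim_1$ and $Sim_2$, and the penalty term is assumed to be zero while $R$ is still empty.\n",
    "\n",
    "With `lam=1.0` the loop should return the two near-duplicate documents (indices 0 and 1); with `lam=0.2` it should swap the second pick for the dissimilar document (index 2)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "\n",
    "def cosine(u, v):\n",
    "    # Cosine similarity between two equal-length vectors.\n",
    "    dot = sum(a * b for a, b in zip(u, v))\n",
    "    norm_u = math.sqrt(sum(a * a for a in u))\n",
    "    norm_v = math.sqrt(sum(b * b for b in v))\n",
    "    return dot / (norm_u * norm_v)\n",
    "\n",
    "\n",
    "def mmr_select(query, docs, top_k, lam):\n",
    "    # Greedy MMR: returns a list of (doc_index, mmr_score) pairs.\n",
    "    selected = []\n",
    "    remaining = list(range(len(docs)))\n",
    "    while remaining and len(selected) < top_k:\n",
    "        best_idx, best_score = None, -math.inf\n",
    "        for i in remaining:\n",
    "            relevance = cosine(docs[i], query)\n",
    "            # Assumption: no penalty while nothing has been selected yet.\n",
    "            redundancy = max(\n",
    "                (cosine(docs[i], docs[j]) for j, _ in selected), default=0.0\n",
    "            )\n",
    "            score = lam * relevance - (1 - lam) * redundancy\n",
    "            if score > best_score:\n",
    "                best_idx, best_score = i, score\n",
    "        selected.append((best_idx, best_score))\n",
    "        remaining.remove(best_idx)\n",
    "    return selected\n",
    "\n",
    "\n",
    "# Hypothetical toy embeddings: docs 0 and 1 are near-duplicates, doc 2 differs.\n",
    "toy_query = [1.0, 0.0]\n",
    "toy_docs = [[1.0, 0.1], [1.0, 0.12], [0.6, 0.8]]\n",
    "\n",
    "print(\"lam=1.0:\", mmr_select(toy_query, toy_docs, top_k=2, lam=1.0))\n",
    "print(\"lam=0.2:\", mmr_select(toy_query, toy_docs, top_k=2, lam=0.2))"
   ]
  },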
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "The author grew up writing essays on topics they had stacked up, exploring other things they could work on, and learning Italian. They lived in Florence, Italy and experienced the city at street level in all conditions. They also studied art and painting, and became familiar with the signature style seekers at RISD. They later moved to Cambridge, Massachusetts and got an apartment that was rent-stabilized. They worked on software, including a code editor and an online store builder, and wrote essays about their experiences. They also founded Y Combinator, a startup accelerator, and created the Summer Founders Program to give undergrads an alternative to working at tech companies.\n"
     ]
    }
   ],
   "source": [
    "from llama_index import VectorStoreIndex, SimpleDirectoryReader\n",
    "\n",
    "# llama_index/docs/examples/data/paul_graham\n",
    "documents = SimpleDirectoryReader(\"../data/paul_graham/\").load_data()\n",
    "index = VectorStoreIndex.from_documents(documents)\n",
    "\n",
    "# To use mmr, set it as a vector_store_query_mode\n",
    "query_engine = index.as_query_engine(vector_store_query_mode=\"mmr\")\n",
    "response = query_engine.query(\"What did the author do growing up?\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "The author grew up writing essays on topics they had stacked up, exploring other things they could work on, and learning Italian. They lived in Florence, Italy and experienced the city at street level in all conditions. They also studied art and painting, and became familiar with the signature style seekers at RISD. They later moved to Cambridge, Massachusetts and got an apartment that was rent-stabilized. They worked on software, including a code editor and an online store builder, and wrote essays about their experiences. They also founded Y Combinator, a startup accelerator, and developed the batch model of funding startups.\n"
     ]
    }
   ],
   "source": [
    "from llama_index import VectorStoreIndex, SimpleDirectoryReader\n",
    "\n",
    "documents = SimpleDirectoryReader(\"../data/paul_graham/\").load_data()\n",
    "index = VectorStoreIndex.from_documents(documents)\n",
    "\n",
    "# To set the threshold, set it in vector_store_kwargs\n",
    "query_engine_with_threshold = index.as_query_engine(\n",
    "    vector_store_query_mode=\"mmr\", vector_store_kwargs={\"mmr_threshold\": 0.2}\n",
    ")\n",
    "\n",
    "response = query_engine_with_threshold.query(\"What did the author do growing up?\")\n",
    "print(response)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that the node score is scaled by the threshold and additionally penalized for similarity to previously selected nodes. As the threshold approaches 1, the scores approach the plain similarity scores and similarity to previous nodes is ignored, turning off the effect of MMR. Lowering the threshold makes the algorithm prefer more diverse documents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Scores without MMR  [0.8139363671956625, 0.8110763805571549]\n",
      "Scores with MMR and a threshold of 0.8  [0.6511610127407832, 0.4716293734403398]\n",
      "Scores with MMR and a threshold of 0.2  [0.16278861260228436, -0.4745776806511904]\n"
     ]
    }
   ],
   "source": [
    "index1 = VectorStoreIndex.from_documents(documents)\n",
    "query_engine_no_mmr = index1.as_query_engine()\n",
    "response_no_mmr = query_engine_no_mmr.query(\"What did the author do growing up?\")\n",
    "\n",
    "index2 = VectorStoreIndex.from_documents(documents)\n",
    "query_engine_with_high_threshold = index2.as_query_engine(\n",
    "    vector_store_query_mode=\"mmr\", vector_store_kwargs={\"mmr_threshold\": 0.8}\n",
    ")\n",
    "response_high_threshold = query_engine_with_high_threshold.query(\n",
    "    \"What did the author do growing up?\"\n",
    ")\n",
    "\n",
    "index3 = VectorStoreIndex.from_documents(documents)\n",
    "query_engine_with_low_threshold = index3.as_query_engine(\n",
    "    vector_store_query_mode=\"mmr\", vector_store_kwargs={\"mmr_threshold\": 0.2}\n",
    ")\n",
    "response_low_threshold = query_engine_with_low_threshold.query(\n",
    "    \"What did the author do growing up?\"\n",
    ")\n",
    "\n",
    "print(\"Scores without MMR \", [node.score for node in response_no_mmr.source_nodes])\n",
    "print(\n",
    "    \"Scores with MMR and a threshold of 0.8 \",\n",
    "    [node.score for node in response_high_threshold.source_nodes],\n",
    ")\n",
    "print(\n",
    "    \"Scores with MMR and a threshold of 0.2 \",\n",
    "    [node.score for node in response_low_threshold.source_nodes],\n",
    ")"
   ]
  },
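  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sanity check on the scores printed above (a sketch, assuming the first node is selected while $R$ is still empty, so that its diversity penalty is zero), the first MMR score should be roughly $\\lambda$ times the first score without MMR:\n",
    "$$\n",
    "0.8 \\times 0.8139 \\approx 0.6511, \\qquad 0.2 \\times 0.8139 \\approx 0.1628\n",
    "$$\n",
    "\n",
    "Both values match the printed first scores to about three decimal places; the small remaining gap is presumably because each index is built and embedded separately. The second score in each list is lower because it additionally subtracts $(1 - \\lambda)$ times the node's similarity to the node already selected."
   ]
  },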
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Retrieval-Only Demonstration\n",
    "\n",
    "By setting a small chunk size and adjusting the \"mmr_threshold\" parameter, we can see how the retrieved results\n",
    "change from very diverse (and less relevant) to less diverse (and more relevant/redundant).\n",
    "\n",
    "We try the following values: 0.1, 0.5, 0.8, 1.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index import (\n",
    "    VectorStoreIndex,\n",
    "    SimpleDirectoryReader,\n",
    "    ServiceContext,\n",
    ")\n",
    "from llama_index.response.notebook_utils import display_source_node\n",
    "from llama_index.llms import OpenAI\n",
    "\n",
    "llm = OpenAI(temperature=0, model=\"gpt-3.5-turbo\")\n",
    "service_context = ServiceContext.from_defaults(llm=llm, chunk_size_limit=64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# llama_index/docs/examples/data/paul_graham\n",
    "documents = SimpleDirectoryReader(\"../data/paul_graham/\").load_data()\n",
    "index = VectorStoreIndex.from_documents(documents, service_context=service_context)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = index.as_retriever(\n",
    "    vector_store_query_mode=\"mmr\",\n",
    "    similarity_top_k=3,\n",
    "    vector_store_kwargs={\"mmr_threshold\": 0.1},\n",
    ")\n",
    "nodes = retriever.retrieve(\"What did the author do during his time in Y Combinator?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 40d925c0-67fb-47eb-84f7-51728b224a6d<br>**Similarity:** 0.08476292699394482<br>**Text:** initial set of customers almost entirely from among their batchmates.\n",
       "\n",
       "I had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 72651e88-62cc-4d99-baf8-222c05b5e129<br>**Similarity:** -0.5616228896922558<br>**Text:** and because I painted them on leftover scraps of canvas, which was all I could afford at the time. Painting still lives is different from painting people, because the subject, as its name suggests, can't move. People can't sit for more than about 15 minutes at...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 0328e711-c8f7-4a91-a0c1-a372068e3f1c<br>**Similarity:** -0.5230344987656315<br>**Text:** alternative to the Turing machine. If you want to write an interpreter for a language in itself, what's the minimum set of predefined operators you need? The Lisp that John McCarthy invented, or more accurately discovered, is an answer to that question....<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "for n in nodes:\n",
    "    display_source_node(n, source_length=1000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = index.as_retriever(\n",
    "    vector_store_query_mode=\"mmr\",\n",
    "    similarity_top_k=3,\n",
    "    vector_store_kwargs={\"mmr_threshold\": 0.5},\n",
    ")\n",
    "nodes = retriever.retrieve(\"What did the author do during his time in Y Combinator?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 40d925c0-67fb-47eb-84f7-51728b224a6d<br>**Similarity:** 0.42381204797542626<br>**Text:** initial set of customers almost entirely from among their batchmates.\n",
       "\n",
       "I had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 0328e711-c8f7-4a91-a0c1-a372068e3f1c<br>**Similarity:** 0.018193356482163803<br>**Text:** alternative to the Turing machine. If you want to write an interpreter for a language in itself, what's the minimum set of predefined operators you need? The Lisp that John McCarthy invented, or more accurately discovered, is an answer to that question....<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** fbefd791-308a-4438-b6ec-353c2f05867b<br>**Similarity:** 0.05669398537137432<br>**Text:** and partly because I was focused on my mother, whose cancer had returned.\n",
       "\n",
       "She died on January 15, 2014. We knew this was coming, but it was still hard when it did.\n",
       "\n",
       "I kept working on YC till March, to help get that batch of startups through...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "for n in nodes:\n",
    "    display_source_node(n, source_length=1000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = index.as_retriever(\n",
    "    vector_store_query_mode=\"mmr\",\n",
    "    similarity_top_k=3,\n",
    "    vector_store_kwargs={\"mmr_threshold\": 0.8},\n",
    ")\n",
    "nodes = retriever.retrieve(\"What did the author do during his time in Y Combinator?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 40d925c0-67fb-47eb-84f7-51728b224a6d<br>**Similarity:** 0.6781190611335854<br>**Text:** initial set of customers almost entirely from among their batchmates.\n",
       "\n",
       "I had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 7a8189bc-ccb6-402d-8ce5-49587b13878e<br>**Similarity:** 0.49504062407907184<br>**Text:** next several years I wrote lots of essays about all kinds of different topics. O'Reilly reprinted a collection of them as a book, called Hackers & Painters after one of the essays in it. I also worked on spam filters, and did some more painting....<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 3ed4c422-a297-40b9-9510-68cc8f18e2c9<br>**Similarity:** 0.5017248860360811<br>**Text:** Y Combinator was not the original name. At first we were called Cambridge Seed. But we didn't want a regional name, in case someone copied us in Silicon Valley, so we renamed ourselves after one of the coolest tricks in the lambda calculus, the Y...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "for n in nodes:\n",
    "    display_source_node(n, source_length=1000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "retriever = index.as_retriever(\n",
    "    vector_store_query_mode=\"mmr\",\n",
    "    similarity_top_k=3,\n",
    "    vector_store_kwargs={\"mmr_threshold\": 1.0},\n",
    ")\n",
    "nodes = retriever.retrieve(\"What did the author do during his time in Y Combinator?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 40d925c0-67fb-47eb-84f7-51728b224a6d<br>**Similarity:** 0.8476240959508525<br>**Text:** initial set of customers almost entirely from among their batchmates.\n",
       "\n",
       "I had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 1a8b0250-9b62-418c-a1df-6af4454a77e7<br>**Similarity:** 0.8252174449518838<br>**Text:** already helped write the RSS spec and would a few years later become a martyr for open access, and Sam Altman, who would later become the second president of YC. I don't think it was entirely luck that the first batch was so good. You had to be pretty bold...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/markdown": [
       "**Document ID:** 7d571ed4-0f23-41cd-a2fd-8a590c9e8f11<br>**Similarity:** 0.8227484107217059<br>**Text:** announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get \"deal flow,\" as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders...<br>"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "for n in nodes:\n",
    "    display_source_node(n, source_length=1000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_index_v2",
   "language": "python",
   "name": "llama_index_v2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
